CSE 599 i : Online and Adaptive Machine Learning Winter 2018 Lecture 6 : Non - stochastic best arm identification

نویسندگان

Kevin Jamieson

Anran Wang

Beibin Li

Brian Chan

Shiqing Yu

Zhijin Zhou

چکیده

Example 1. Imagine that we are solving a non-convex optimization problem on some (multivariate) function f using gradient descent. Recall that gradient descent converges to local minima. Because non-convex functions may have multiple minima, we cannot guarantee that gradient descent will converge to the global minimum. To resolve this issue, we will use random restarts, the process of starting multiple instances of gradient descent from random locations and outputting the least-valued minimum found. If an instance of gradient descent is obviously not going to converge quickly to a low-valued minimum, that instance should be stopped early in favor of other, more-promising ones.

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

منابع مشابه

Non-stochastic Best Arm Identification and Hyperparameter Optimization

Motivated by the task of hyperparameter optimization, we introduce the non-stochastic bestarm identification problem. Within the multiarmed bandit literature, the cumulative regret objective enjoys algorithms and analyses for both the non-stochastic and stochastic settings while to the best of our knowledge, the best-arm identification framework has only been considered in the stochastic settin...

متن کامل

Private Stochastic Multi-arm Bandits: From Theory to Practice

In this paper we study the problem of private stochastic multi-arm bandits. Our notion of privacy is the same as some of the earlier works in the general area of private online learning (Dwork et al., 2010; Jain et al., 2012; Smith and Thakurta, 2013). We design algorithms that are i) differentially private, and ii) have regret guarantees that (almost) match the regret guarantees for the best n...

متن کامل

A New Fuzzy Stabilizer Based on Online Learning Algorithm for Damping of Low-Frequency Oscillations

A multi objective Honey Bee Mating Optimization (HBMO) designed by online learning mechanism is proposed in this paper to optimize the double Fuzzy-Lead-Lag (FLL) stabilizer parameters in order to improve low-frequency oscillations in a multi machine power system. The proposed double FLL stabilizer consists of a low pass filter and two fuzzy logic controllers whose parameters can be set by the ...

متن کامل

Black-Box Reductions for Parameter-free Online Learning in Banach Spaces

We introduce several new black-box reductions that significantly improve the design of adaptive and parameterfree online learning algorithms by simplifying analysis, improving regret guarantees, and sometimes even improving runtime. We reduce parameter-free online learning to online exp-concave optimization, we reduce optimization in a Banach space to one-dimensional optimization, and we reduce...

متن کامل

Multi-Armed Bandits on Unit Interval Graphs

An online learning problem with side information on the similarity and dissimilarity across different actions is considered. The problem is formulated as a stochastic multiarmed bandit problem with a graph-structured learning space. Each node in the graph represents an arm in the bandit problem and an edge between two nodes represents closeness in their mean rewards. It is shown that the result...

متن کامل

ذخیره در منابع من

ذخیره در منابع من قبلا به منابع من ذحیره شده

{@ msg_add @}

با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

عنوان ژورنال:

دوره شماره

صفحات -

تاریخ انتشار 2018

CSE 599 i : Online and Adaptive Machine Learning Winter 2018 Lecture 6 : Non - stochastic best arm identification

نویسندگان

چکیده

منابع مشابه

Non-stochastic Best Arm Identification and Hyperparameter Optimization

Private Stochastic Multi-arm Bandits: From Theory to Practice

A New Fuzzy Stabilizer Based on Online Learning Algorithm for Damping of Low-Frequency Oscillations

Black-Box Reductions for Parameter-free Online Learning in Banach Spaces

Multi-Armed Bandits on Unit Interval Graphs

عنوان ژورنال:

اشتراک گذاری